(i) Theological Justifications

The post-medieval period coincided with a gradual transition from theology to science as the predominant means of revealing the workings of nature. In many cases, espoused principles of parsimony continued to wear their theological origins on their sleeves, as with Leibniz's thesis that God has created the best and most complete of all possible worlds, and his linking of this thesis to simplifying principles such as light always taking the (time-wise) shortest path. A similar attitude, and a similar rhetoric, is shared by scientists through the early modern and modern period, including Kepler, Newton, and Maxwell. Some of this rhetoric has survived to the present day, especially among theoretical physicists and cosmologists such as Einstein and Hawking.

Yet there are clear dangers with relying on a theological justification of simplicity principles. Firstly, many (probably most) contemporary scientists are reluctant to link methodological principles to religious belief in this way. Secondly, even those scientists who do talk of ‘God’ often turn out to be using the term metaphorically, and not necessarily as referring to the personal and intentional Being of monotheistic religions. Thirdly, even if there is a tendency to justify simplicity principles via some literal belief in the existence of God, such justification is only rational to the extent that rational arguments can be given for the existence of God.

For these reasons, few philosophers today are content to rest with a theological justification of simplicity principles. Yet there is no doubting the influence such justifications have had on past and present attitudes to simplicity. As Smart writes: “There is a tendency…for us to take simplicity…as a guide to metaphysical truth. Perhaps this tendency derives from earlier theological notions: we expect God to have created a beautiful universe” (Smart 1984, p. 121).

(ii) Metaphysical Justifications

One approach to justifying simplicity principles is to embed such principles in some more general metaphysical framework. Perhaps the clearest historical example of systematic metaphysics of this sort is the work of Leibniz. The leading contemporary example of this approach, and in one sense a direct descendant of Leibniz's methodology, is the possible worlds framework of David Lewis. In one of his earlier works, Lewis writes, “I subscribe to the general view that qualitative parsimony is good in a philosophical or empirical hypothesis” (Lewis 1973, p. 87). Lewis has been attacked for not saying more about what exactly he takes simplicity to be (see Woodward 2003). However, what is clear is that simplicity plays a key role in underpinning his metaphysical framework, and is also taken to be a prima facie theoretical virtue.

(iii) ‘Intrinsic Value’ Justifications

Some philosophers have approached the issue of justifying simplicity principles by arguing that simplicity has intrinsic value as a theoretical goal. Sober, for example, writes: “Just as the question ‘why be rational?’ may have no non-circular answer, the same may be true of the question ‘why should simplicity be considered in evaluating the plausibility of hypotheses?’” (Sober 2001, p. 19). Such intrinsic value may be ‘primitive’ in some sense, or it may be analyzable as one aspect of some broader value. For those who favor the second approach, a popular candidate for this broader value is aesthetic.
Derkse 1992 is a book-length development of this idea, and echoes can be found in Quine's remarks (in connection with his defense of Occam's Razor) concerning his taste for “clear skies” and “desert landscapes.” In general, forging a connection between aesthetic virtue and simplicity principles seems better suited to defending methodological rather than epistemic principles.

(iv) Justifications via Principles of Rationality

Another approach is to try to show how simplicity principles follow from other better-established or better-understood principles of rationality. For example, some philosophers just stipulate that they will take ‘simplicity’ as shorthand for whatever package of theoretical virtues is (or ought to be) characteristic of rational inquiry. A more substantive alternative is to link simplicity to some particular theoretical goal, for example unification (see Friedman 1983). While this approach might work for elegance, it is less clear how it can be maintained for ontological parsimony. Conversely, a line of argument which seems better suited to defending parsimony than to defending elegance is to appeal to a principle of epistemological conservatism. Parsimony in a theory can be viewed as minimizing the number of ‘new’ kinds of entities and mechanisms which are postulated. This preference for old mechanisms may in turn be justified by a more general epistemological caution, or conservatism, which is characteristic of rational inquiry.

Note that the above style of approach can be given both a rationalist and an empiricist gloss. If unification, or epistemological conservatism, is itself an a priori rational principle, then simplicity principles stand to inherit this feature if this approach can be carried out successfully.
However, philosophers with empiricist sympathies may also pursue analysis of this sort, and then justify the base principles either inductively from past success or naturalistically from the fact that such principles are in fact used in science.

To summarize, the main problem with a priori justifications of simplicity principles is that it can be difficult to distinguish between an a priori defense and no defense at all. Sometimes the theoretical virtue of simplicity is invoked as a primitive, self-evident proposition that cannot be further justified or elaborated upon. (One example is the beginning of Goodman and Quine's 1947 paper, where they state that their refusal to admit abstract objects into their ontology is “based on a philosophical intuition that cannot be justified by appeal to anything more ultimate” (Goodman & Quine 1947, p. 174).) It is unclear where leverage for persuading skeptics of the validity of such principles can come from, especially if the grounds provided are not themselves to beg further questions. Misgivings of this sort have led to a shift away from justifications rooted in ‘first philosophy’ towards approaches which engage to a greater degree with the details of actual practice, both scientific and statistical. These other approaches will be discussed in the next two sections.

4. Naturalistic Justifications of Simplicity

The rise of naturalized epistemology as a movement within analytic philosophy in the second half of the 20th Century has largely sidelined the rationalist style of approach. From the naturalistic perspective, philosophy is conceived of as continuous with science, and not as having some independently privileged status. The perspective of the naturalistic philosopher may be broader, but her concerns and methods are not fundamentally different from those of the scientist. The conclusion is that science neither needs, nor can legitimately be given, external philosophical justification.
It is against this broadly naturalistic background that some philosophers have sought to provide an epistemic justification of simplicity principles, and in particular principles of ontological parsimony such as Occam's Razor. The main empirical evidence bearing on this issue consists of the patterns of acceptance and rejection of competing theories by working scientists. Einstein's development of Special Relativity, and its impact on the hypothesis of the existence of the electromagnetic ether, is one of the episodes most often cited (by both philosophers and scientists) as an example of Occam's Razor in action (see Sober 1981, p. 153). The ether is by hypothesis a fixed medium and reference frame for the propagation of light (and other electromagnetic waves). The Special Theory of Relativity includes the radical postulate that the speed of a light ray through a vacuum is constant relative to an observer no matter what the state of motion of the observer. Given this assumption, the notion of a universal reference frame is incoherent. Hence Special Relativity implies that the ether does not exist. This episode can be viewed as the replacement of an empirically adequate theory (the Lorentz-Poincaré theory) by a more ontologically parsimonious alternative (Special Relativity).

The problem with using this example as evidence for Occam's Razor is that Special Relativity (SR) has several other theoretical advantages over the Lorentz-Poincaré (LP) theory in addition to being more ontologically parsimonious. Firstly, SR is a simpler and more unified theory than LP, since in order to ‘save the phenomena’ a number of ad hoc and physically unmotivated patches had been added to LP. Secondly, LP raises doubts about the physical meaning of distance measurements. According to LP, a rod moving with velocity $v$ contracts by a factor of $(1 - v^2/c^2)^{1/2}$.
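As a worked illustration of the contraction factor just mentioned, consider a single sample speed. The symbols $\ell_0$ (the rod's length when at rest relative to the ether) and $\ell$ (its contracted length), and the choice of $v = 0.6\,c$, are assumptions introduced here for illustration rather than notation from the text.

```latex
\ell = \ell_0 \left(1 - \frac{v^2}{c^2}\right)^{1/2},
\qquad
v = 0.6\,c \;\Rightarrow\; \ell = \ell_0\,(1 - 0.36)^{1/2} = 0.8\,\ell_0 .
```

On LP, then, a rod moving through the ether at this speed would be 20% shorter than the same rod at rest in the ether.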
Thus only distance measurements that are made in a frame at rest relative to the ether are valid without modification by a correction factor. However, LP also implies that motion relative to the ether is in principle undetectable. So how is distance to be measured? In other words, the issue here is complicated by the fact that, according to LP, the ether is not just an extra piece of ontology but an undetectable extra piece. Given these advantages of SR over LP, it seems clear that the ether example is not merely a case of ontological parsimony making up for an otherwise inferior theory. A genuine test-case for Occam's Razor must involve an ontologically parsimonious theory which is not clearly superior to its rivals in other respects.

An instructive example is the following historical episode from biogeography, a scientific subdiscipline which originated towards the end of the 18th Century, and whose central purpose was to explain the geographical distribution of plant and animal species. In 1761, the French naturalist Buffon proposed the following law:

(BL) Areas separated by natural barriers have distinct species.

There were also known exceptions to Buffon's Law, for example remote islands which share (so-called) ‘cosmopolitan’ species with continental regions a large distance away.

Two rival theories were developed to explain Buffon's Law and its occasional exceptions. According to the first theory, due to Darwin and Wallace, both facts can be explained by the combined effects of two causal mechanisms: dispersal, and evolution by natural selection. The explanation for Buffon's Law is as follows. Species gradually migrate into new areas, a process which Darwin calls “dispersal.” As natural selection acts over time on the contingent initial distribution of species in different areas, completely distinct species eventually evolve.
The existence of cosmopolitan species is explained by “improbable dispersal,” Darwin's term for dispersal across seemingly impenetrable barriers by “occasional means of transport” such as ocean currents, winds, and floating ice. Cosmopolitan species are explained as the result of improbable dispersal in the relatively recent past.

In the 1950s, Croizat proposed an alternative to the Darwin-Wallace theory which rejects their presupposition of geographical stability. Croizat argues that tectonic change, not dispersal, is the principal causal mechanism which underlies Buffon's Law. Forces such as continental drift, the submerging of ocean floors, and the formation of mountain ranges have acted within the time frame of evolutionary history to create natural barriers between species where at previous times there were none. Croizat's theory was the sophisticated culmination of a theoretical tradition which stretched back to the late 17th Century. Followers of this so-called “extensionist” tradition had postulated the existence of ancient land bridges to account for anomalies in the geographical distribution of plants and animals.

Extensionist theories are clearly less ontologically parsimonious than Dispersal theories, since the former are committed to extra entities such as land bridges or movable tectonic plates. Moreover, Extensionist theories were (given the evidence then available) not manifestly superior in other respects. Darwin was an early critic of Extensionist theories, arguing that they went beyond the “legitimate deductions of science.” Another critic of Extensionist theories pointed to their “dependence on ad hoc hypotheses, such as land bridges and continental extensions of vast extent, to meet each new distributional anomaly” (Fichman 1977, p.
62). The debate over the more parsimonious Dispersal theories centered on whether the mechanism of dispersal is sufficient on its own to explain the known facts about species distribution, without postulating any extra geographical or tectonic entities.

The criticisms leveled at the Extensionist and Dispersal theories follow a pattern that is characteristic of situations in which one theory is more ontologically parsimonious than its rivals. In such situations the debate is typically over whether the extra ontology is really necessary in order to explain the observed phenomena. The less parsimonious theories are condemned for profligacy, and lack of direct evidential support. The more parsimonious theories are condemned for their inadequacy to explain the observed facts. This illustrates a recurring theme in discussions of simplicity (both inside and outside philosophy), namely how the correct balance between simplicity and goodness of fit ought to be struck. This theme takes center stage in the statistical approaches to simplicity discussed in Section 5.

Less work has been done on describing episodes in science where elegance, as opposed to parsimony, has been (or may have been) the crucial factor. This may just reflect the fact that considerations linked to elegance are so pervasive in scientific theory choice as to be unremarkable as a topic for special study. A notable exception to this general neglect is the area of celestial mechanics, where the transition from Ptolemy to Copernicus to Kepler to Newton is an oft-cited example of simplicity considerations in action, and a case study which makes much more sense when seen through the lens of elegance rather than of parsimony.

Naturalism depends on a number of presuppositions which are open to debate.
But even if these presuppositions are granted, the naturalistic project of looking to science for methodological guidance within philosophy faces a major difficulty, namely how to ‘read off’ from actual scientific practice what the underlying methodological principles are supposed to be. Burgess, for example, argues that what the patterns of scientific behavior show is not a concern with multiplying entities per se, but a concern more specifically with multiplying ‘causal mechanisms’ (Burgess 1998). And Sober considers the debate in psychology over psychological egoism versus motivational pluralism, arguing that the former theory postulates fewer types of ultimate desire but a larger number of causal beliefs, and hence that comparing the parsimony of these two theories depends on what is counted and how (Sober 2001, pp. 14–5). Some of the concerns raised in Sections 1 and 2 also reappear in this context; for example, how the world is sliced up into kinds affects the extent to which a given theory ‘multiplies’ kinds of entity. Justifying a particular way of slicing becomes more difficult once the epistemological naturalist leaves behind the a priori, metaphysical presuppositions of the rationalist approach.

One philosophical debate where these worries over naturalism become particularly acute is the issue of the application of parsimony principles to abstract objects. The scientific data is, in an important sense, ambiguous. Applications of Occam's Razor in science are always to concrete, causally efficacious entities, whether land-bridges, unicorns, or the luminiferous ether. Perhaps scientists apply an unrestricted version of Occam's Razor to that portion of reality in which they are interested, namely the concrete, causal, spatiotemporal world. Or perhaps scientists apply a ‘concretized’ version of Occam's Razor unrestrictedly. Which is the case?
The answer determines which general philosophical principle we end up with: ought we to avoid the multiplication of objects of whatever kind, or merely the multiplication of concrete objects? The distinction here is crucial for a number of central philosophical debates. Unrestricted Occam's Razor favors monism over dualism, and nominalism over platonism. By contrast, ‘concretized’ Occam's Razor has no bearing on these debates, since the extra entities in each case are not concrete.

5. Probabilistic/Statistical Justifications of Simplicity

The two approaches discussed in Sections 3 and 4 (a priori rationalism and naturalized empiricism) are both in some sense extreme. Simplicity principles are taken either to have no empirical grounding, or to have solely empirical grounding. Perhaps as a result, both these approaches yield vague answers to certain key questions about simplicity. In particular, neither seems equipped to answer how exactly simplicity ought to be balanced against empirical adequacy. Simple but wildly inaccurate theories are not hard to come up with. Nor are accurate theories which are highly complex. But how much accuracy should be sacrificed for a gain in simplicity? The black-and-white boundaries of the rationalism/empiricism divide may not provide appropriate tools for analyzing this question. In response, philosophers have recently turned to the mathematical framework of probability theory and statistics, hoping in the process to combine sensitivity to actual practice with the ‘trans-empirical’ strength of mathematics.

Philosophically influential early work in this direction was done by Jeffreys and by Popper, both of whom tried to analyze simplicity in probabilistic terms.
Jeffreys argued that “the simpler laws have the greater prior probability,” and went on to provide an operational measure of simplicity, according to which the prior probability of a law is $2^{-k}$, where $k$ = order + degree + absolute values of the coefficients, when the law is expressed as a differential equation (Jeffreys 1961, p. 47). A generalization of Jeffreys' approach is to look not at specific equations, but at families of equations. For example, one might compare the family, LIN, of linear equations (of the form $y = a + bx$) with the family, PAR, of parabolic equations (of the form $y = a + bx + cx^2$). Since PAR is of higher degree than LIN, Jeffreys' proposal assigns higher probability to LIN. Laws of this form are intuitively simpler (in the sense of being more elegant).

Popper 1959 pointed out that Jeffreys' proposal, as it stands, contradicts the axioms of probability. Every member of LIN is also a member of PAR, where the coefficient, $c$, is set to 0. Hence ‘Law, L, is a member of LIN’ entails ‘Law, L, is a member of PAR.’ Jeffreys' approach assigns higher probability to the former than the latter. But it follows from the axioms of probability that when A entails B, the probability of B is greater than or equal to the probability of A, so that $P(\mathrm{LIN}) \le P(\mathrm{PAR})$. Popper argues, in contrast to Jeffreys, that LIN has lower prior probability than PAR. Hence LIN is, in Popper's sense, more falsifiable, and hence should be preferred as the default hypothesis. One response to Popper's objection is to amend Jeffreys' proposal and restrict members of PAR to equations where $c \neq 0$.

More recent work on the issue of simplicity has borrowed tools from statistics as well as from probability theory. It should be noted that the literature on this topic tends to use the terms ‘simplicity’ and ‘parsimony’ more-or-less interchangeably (see Sober 2003).
But, whichever term is preferred, there is general agreement among those working in this area that simplicity is to be cashed out in terms of the number of free (or ‘adjustable’) parameters of competing hypotheses. Thus the focus here is entirely at the level of theory. Philosophers who have made important contributions to this approach include Forster and Sober 1994 and Lange 1995.

The standard case in the statistical literature on parsimony concerns curve-fitting. We imagine a situation in which we have a set of discrete data points and are looking for the curve (i.e. function) which has generated them. The issue of what family of curves the answer belongs in (e.g. in LIN or in PAR) is often referred to as model selection. The basic idea is that there are two competing criteria for model selection: parsimony and goodness of fit. The possibility of measurement error and ‘noise’ in the data means that the correct curve may not go through every data point. Indeed, if goodness of fit were the only criterion then there would be a danger of ‘overfitting’ the model to accidental discrepancies unrepresentative of the broader regularity. Parsimony acts as a counterbalance to such overfitting, since a curve passing through every data point is likely to be very convoluted and hence have many adjusted parameters.

If proponents of the statistical approach are in general agreement that simplicity should be cashed out in terms of number of parameters, there is less unanimity over what the goal of simplicity principles ought to be. This is partly because the goal is often not made explicit. (An analogous issue arises in the case of Occam's Razor. ‘Entities are not to be multiplied beyond necessity.’ But necessity for what, exactly?) Forster 2001 distinguishes two potential goals of model selection, namely probable truth and predictive accuracy, and claims that these are importantly distinct (Forster 2001, p. 95).
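The tension between goodness of fit and parsimony in the curve-fitting case described above can be made concrete with a small sketch. The data points below are hypothetical values invented for illustration (roughly linear, with some noise), and the helper names `fit_polynomial` and `residual_sum_of_squares` are assumptions of this example, not anything taken from the literature. The sketch fits the families LIN (degree 1) and PAR (degree 2), together with a degree-4 polynomial that has as many adjustable parameters as there are data points.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns coefficients [a0, a1, ...] of a0 + a1*x + ..."""
    m = degree + 1  # number of adjustable parameters
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def residual_sum_of_squares(xs, ys, coeffs):
    """Sum of squared vertical distances between the data and the fitted curve."""
    def predict(x):
        return sum(c * x ** i for i, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical noisy data scattered around the line y = x.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-0.9, -0.6, 0.1, 0.4, 1.1]

rss_by_degree = {
    d: residual_sum_of_squares(xs, ys, fit_polynomial(xs, ys, d))
    for d in (1, 2, 4)
}
for d, rss in rss_by_degree.items():
    print(f"degree {d}: {d + 1} parameters, RSS = {rss:.6f}")
```

Because LIN is nested in PAR, and PAR in the degree-4 family, the residual sum of squares can only shrink as parameters are added, and the degree-4 curve passes through all five points. Goodness of fit alone would therefore always favor the most convoluted curve, which is the overfitting danger that parsimony is meant to offset.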
Forster argues that predictive accuracy tends to be what scientists care about most. They care less about the probability of an hypothesis being exactly right than they do about it having a high degree of accuracy.

One reason for investigating statistical approaches to simplicity is a dissatisfaction with the vagaries of the a priori and naturalistic approaches. Statisticians have come up with a variety of numerically specific proposals for the trade-off between simplicity and goodness of fit. However, these alternative proposals disagree about the ‘cost’ associated with more complex hypotheses. Two leading contenders in the recent literature on model selection are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). AIC directs theorists to choose the model with the highest value of $\frac{\log L(\Theta_k)}{n} - \frac{k}{n}$, where $\Theta_k$ is the best-fitting member of the class of curves of polynomial degree $k$, $\log L$ is log-likelihood, and $n$ is the sample size. By contrast, BIC maximizes the value of $\frac{\log L(\Theta_k)}{n} - \frac{k \log n}{2n}$. In effect, BIC gives an extra positive weighting to simplicity by a factor of $\log n / 2$ (where $n$ is the size of the sample).

Extreme answers to the trade-off problem seem to be obviously inadequate. Always picking the model with the best fit to the data, regardless of its complexity, faces the prospect (mentioned earlier) of ‘overfitting’ error and noise in the data. Always picking the simplest model, regardless of its fit to the data, cuts the model free from any link to observation or experiment. Forster associates the ‘Always Complex’ and the ‘Always Simple’ rule with empiricism and rationalism respectively. All the candidate rules that are seriously discussed by statisticians fall in between these two extremes.
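The two scores stated above can be compared directly in a short sketch. It uses the formulas exactly as given in the text, treating k as the number of adjustable parameters; the function names `aic_score` and `bic_score`, the candidate labels, and the log-likelihood values are all hypothetical, chosen only to show that the two rules can rank the same pair of models differently.

```python
import math


def aic_score(log_likelihood, k, n):
    """Per-datum AIC score as stated in the text: log L / n - k / n."""
    return log_likelihood / n - k / n


def bic_score(log_likelihood, k, n):
    """Per-datum BIC score as stated in the text: log L / n - k * log(n) / (2n)."""
    return log_likelihood / n - k * math.log(n) / (2 * n)


# Hypothetical candidates: (maximized log-likelihood, number of parameters k).
n = 100
candidates = {"simpler": (-120.0, 2), "more complex": (-118.5, 3)}

# Both rules pick the candidate with the highest score.
best_by_aic = max(candidates, key=lambda name: aic_score(*candidates[name], n))
best_by_bic = max(candidates, key=lambda name: bic_score(*candidates[name], n))
print("AIC prefers:", best_by_aic)
print("BIC prefers:", best_by_bic)
```

With these made-up numbers the two rules disagree: the AIC-style score favors the more complex candidate, while the BIC-style score, whose per-parameter penalty is larger by a factor of log n / 2 (about 2.3 for n = 100), favors the simpler one. Both rules nonetheless sit between the ‘Always Complex’ and ‘Always Simple’ extremes.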
Yet these rules differ in their answers over how much weight to give simplicity in its trade-off against goodness of fit. In addition to AIC and BIC, other rules include Neyman-Pearson hypothesis testing, and the minimum description length (MDL) criterion.

There are at least three possible responses to the varying answers to the trade-off problem provided by different criteria. One response, favored by Forster and by Sober, is to argue that there is no genuine conflict here because the different criteria have different aims. Thus AIC and BIC might both be optimal criteria, if AIC is aiming to maximize predictive accuracy whereas BIC is aiming to maximize probable truth. Another difference that may influence the choice of criterion is whether the goal of the model is to extrapolate beyond given data or interpolate between known data points. A second response, typically favored by statisticians, is to argue that the conflict is genuine but that it has the potential to be resolved by analyzing (using both mathematical and empirical methods) which criterion performs best over the widest class of possible situations. A third, more pessimistic, response is to argue that the conflict is genuine but is unresolvable. Kuhn (1977) takes this line, claiming that how much weight individual scientists give a particular theoretical virtue, such as simplicity, is solely a matter of taste, and is not open to rational resolution. McAllister (2007) draws ontological morals from a similar conclusion, arguing that sets of data typically exhibit multiple patterns, and that different patterns may be highlighted by different quantitative techniques.

Aside from this issue of conflicting criteria, there are other problems with the statistical approach to simplicity. One problem, which afflicts any approach emphasizing the elegance aspect of simplicity, is language relativity. Crudely put, hypotheses which are syntactically very complex in one language may be syntactically very simple in another.
The traditional philosophical illustration of this problem is Goodman's ‘grue’ challenge to induction. Are statistical approaches to the measurement of simplicity similarly language relative, and, if so, what justifies choosing one language over another? It turns out that the statistical approach has the resources to at least partially deflect the charge of language relativity. Borrowing techniques from information theory, it can be shown that certain syntactic measures of simplicity are asymptotically independent of choice of measurement language.

A second problem for the statistical approach is whether it can account not only for our preference for small numbers over large numbers (when it comes to picking values for coefficients or exponents in model equations), but also our preference for whole numbers and simple fractions over other values. In Gregor Mendel's original experiments on the hybridization of garden peas, he crossed pea varieties with different specific traits, such as tall versus short or green seeds versus yellow seeds, and then self-pollinated the hybrids for one or more generations. In each case one trait was present in all the first-generation hybrids, but both traits were present in subsequent generations. Across his experiments with seven different such traits, the ratio of dominant trait to recessive trait averaged 2.98 : 1. On this basis, Mendel hypothesized that the true ratio is 3 : 1. This ‘rounding’ was made prior to the formulation of any explanatory model, hence it cannot have been driven by any theory-specific consideration. This raises two related questions. First, in what sense is the 3 : 1 ratio hypothesis simpler than the 2.98 : 1 ratio hypothesis? Second, can this choice be justified within the framework of the statistical approach to simplicity?
The more general worry lying behind these questions is whether the statistical approach, in defining simplicity in terms of number of adjustable parameters, is replacing the broad issue of simplicity with a more narrowly (and perhaps arbitrarily) defined set of issues.

A third problem with the statistical approach concerns whether it can shed any light on the specific issue of ontological parsimony. At first glance, one might think that the postulation of extra entities can be attacked on probabilistic grounds. For example, quantum mechanics together with the postulation ‘There exist unicorns’ is no more probable, and presumably less probable, than quantum mechanics alone, since the former logically entails the latter. However, as Sober has pointed out, it is important here to distinguish between agnostic Occam's Razor and atheistic Occam's Razor. Agnostic OR directs theorists only to refrain from asserting that unicorns exist, whereas atheistic OR directs theorists to claim that unicorns do not exist, in the absence of any compelling evidence in their favor. And there is no relation of logical entailment between {QM + there exist unicorns} and {QM + there do not exist unicorns}, so the probabilistic argument gives no support to the atheistic version. This also links back to the terminological issue. Models involving circular orbits are more parsimonious, in the statisticians' sense of ‘parsimonious’, than models involving elliptical orbits, but the latter models do not postulate the existence of any more things in the world.

6. Other Issues Concerning Simplicity

This section addresses three distinct issues concerning simplicity and its relation to other methodological issues. These issues concern quantitative parsimony, plenitude, and induction.

6.1 Quantitative Parsimony

Theorists tend to be frugal in their postulation of new entities. When a trace is observed in a cloud-chamber, physicists may seek to explain it in terms of the influence of a hitherto unobserved particle. But, if possible, they will postulate one such unobserved particle, not two, or twenty, or 207 of them.
This desire to minimize the number of individual new entities postulated is often referred to as quantitative parsimony. David Lewis articulates the attitude of many philosophers when he writes: “I subscribe to the general view that qualitative parsimony is good in a philosophical or empirical hypothesis; but I recognize no presumption whatever in favour of quantitative parsimony” (Lewis 1973, p. 87). Is the initial assumption that one particle is acting to cause the observed trace more rational than the assumption that 207 particles are so acting? Or is it merely the product of wishful thinking, aesthetic bias, or some other non-rational influence?

Nolan 1997 examines these questions in the context of the discovery of the neutrino. Physicists in the 1930s were puzzled by certain anomalies arising from experiments in which radioactive atoms emit electrons during so-called Beta decay. In these experiments the total spin of the particles in the system before decay exceeds by 1/2 the total spin of the (observed) emitted particles. Physicists' response was to posit a ‘new’ fundamental particle, the neutrino, with spin 1/2 and to hypothesize that exactly one neutrino is emitted along with each electron during Beta decay.

Note that there is a wide range of very similar neutrino theories which can also account for the missing spin.

H1: 1 neutrino with a spin of 1/2 is emitted in each case of Beta decay.
H2: 2 neutrinos, each with a spin of 1/4, are emitted in each case of Beta decay.

and, more generally, for any positive integer n,

Hn: n neutrinos, each with a spin of 1/(2n), are emitted in each case of Beta decay.

Each of these hypotheses adequately explains the observation of a missing 1/2-spin following Beta decay.
Yet the most quantitatively parsimonious hypothesis, H1, is the obvious default choice.

One promising approach is to focus on the relative explanatory power of the alternative hypotheses, H1, H2, … Hn. When neutrinos were first postulated in the 1930s, numerous experimental set-ups were being devised to explore the products of various kinds of particle decay. In none of these experiments had cases of ‘missing’ 1/3-spin, or 1/4-spin, or 1/100-spin been found. The absence of these smaller fractional spins was a phenomenon which competing neutrino hypotheses might potentially help to explain.

Consider the following two competing neutrino hypotheses:

H1: 1 neutrino with a spin of 1/2 is emitted in each case of Beta decay.
H10: 10 neutrinos, each with a spin of 1/20, are emitted in each case of Beta decay.

Why has no experimental set-up yielded a ‘missing’ spin-value of 1/20? H1 allows a better answer to this question than H10 does, for H1 is consistent with a simple and parsimonious explanation, namely that there exist no particles with spin 1/20 (or less). In the case of H10, this potential explanation is ruled out because H10 explicitly postulates particles with spin 1/20. Of course, H10 is consistent with other hypotheses which explain the non-occurrence of missing 1/20-spin. For example, one might conjoin to H10 the law that neutrinos are always emitted in groups of ten. However, this would make the overall explanation less syntactically simple, and hence less virtuous in other respects. In this case, quantitative parsimony brings greater explanatory power. Less quantitatively parsimonious hypotheses can match this power only by adding auxiliary claims which decrease their syntactic simplicity. Thus the preference for quantitatively parsimonious hypotheses emerges as one facet of a more general preference for hypotheses with greater explanatory power.
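The arithmetic behind the family of hypotheses Hn discussed above can be checked in a few lines. This is only an illustrative sketch: the function names `spin_per_neutrino` and `total_missing_spin` are hypothetical, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction


def spin_per_neutrino(n):
    """Spin of each neutrino under hypothesis Hn: 1/(2n)."""
    return Fraction(1, 2 * n)


def total_missing_spin(n):
    """Total spin carried off by the n neutrinos that Hn postulates."""
    return n * spin_per_neutrino(n)


for n in (1, 2, 10, 207):
    print(f"H{n}: {n} x {spin_per_neutrino(n)} = {total_missing_spin(n)}")
```

Every Hn accounts for the missing 1/2-spin equally well, so that observation alone cannot discriminate among them. What sets H1 apart is that it posits no particle with a spin smaller than 1/2.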
One distinctive feature of the neutrino example is that it is ‘additive.’ It involves postulating the existence of a collection of qualitatively identical objects which collectively explain the observed phenomenon. The explanation is additive in the sense that the overall phenomenon is explained by summing the individual positive contributions of each object. Whether the above approach can be extended to non-additive cases involving quantitative parsimony (for example, postulating the existence of unseen planets to explain an observed perturbation of an orbit) is unclear.

6.2 Principles of Plenitude

Ranged against the principles of parsimony discussed in previous sections is an equally firmly rooted (though less well-known) tradition of what might be termed “principles of explanatory sufficiency.” These principles have their origins in the same medieval controversies that spawned Occam's Razor. Ockham's contemporary, Walter of Chatton, proposed the following counter-principle to Occam's Razor:

If three things are not enough to verify an affirmative proposition about things, a fourth must be added, and so on (quoted in Maurer 1984, p. 464).

A related counter-principle was later defended by Kant:

The variety of entities should not be rashly diminished (Kant 1950, p. 541). Entium varietates non temere esse minuendas.

There is no inconsistency in the coexistence of these two families of principles, for they are not in direct conflict with each other. Considerations of parsimony and of explanatory sufficiency function as mutual counter-balances, penalizing theories which stray into explanatory inadequacy or ontological excess. What we see here is an historical echo of the contemporary debate among statisticians concerning the proper trade-off between simplicity and goodness of fit.
There is, however, a second family of principles which do appear directly to conflict with Occam's Razor. These are so-called ‘principles of plenitude.’ Perhaps the best-known version is associated with Leibniz, according to whom God created the best of all possible worlds with the greatest number of possible entities. More generally, a principle of plenitude claims that if it is possible for an object to exist then that object actually exists. Principles of plenitude conflict with Occam's Razor over the existence of physically possible but explanatorily idle objects. Our best current theories presumably do not rule out the existence of unicorns, but nor do they provide any support for their existence. According to Occam's Razor we ought not to postulate the existence of unicorns. According to a principle of plenitude we ought to postulate their existence.

The rise of particle physics and quantum mechanics in the 20th century led to various principles of plenitude being appealed to by scientists as an integral part of their theoretical framework. A particularly clear-cut example of such an appeal is the case of magnetic monopoles. The 19th-century theory of electromagnetism postulated numerous analogies between electric charge and magnetic charge. One theoretical difference is that magnetic charges must always come in oppositely-charged pairs, called “dipoles” (as in the North and South poles of a bar magnet), whereas single electric charges, or “monopoles,” can exist in isolation. However, no actual magnetic monopole had ever been observed. Physicists began to wonder whether there was some theoretical reason why monopoles could not exist. It was initially thought that the newly developed theory of quantum mechanics ruled out the possibility of magnetic monopoles, and this is why none had ever been detected.
However, in 1931 the physicist Paul Dirac showed that the existence of monopoles is consistent with quantum mechanics, although it is not required by it. Dirac went on to assert the existence of monopoles, arguing that their existence is not ruled out by theory and that “under these circumstances one would be surprised if Nature had made no use of it” (Dirac 1930, p. 71, note 5). This appeal to plenitude was widely, though not universally, accepted by other physicists.

One of the elementary rules of nature is that, in the absence of laws prohibiting an event or phenomenon it is bound to occur with some degree of probability. To put it simply and crudely: anything that can happen does happen. Hence physicists must assume that the magnetic monopole exists unless they can find a law barring its existence (Ford 1963, p. 122).

Others have been less impressed by Dirac's line of argument:

Dirac's…line of reasoning, when conjecturing the existence of magnetic monopoles, does not differ from 18th-century arguments in favour of mermaids…As the notion of mermaids was neither intrinsically contradictory nor colliding with current biological laws, these creatures were assumed to exist.

It is difficult to know how to interpret these principles of plenitude. Quantum mechanics diverges from classical physics by replacing a deterministic model of the universe with a model based on objective probabilities. According to this probabilistic model, there are numerous ways the universe could have evolved from its initial state, each with a certain probability of occurring that is fixed by the laws of nature. Consider some kind of object, say unicorns, whose existence is not ruled out by the initial conditions plus the laws of nature. Then one can distinguish between a weak and a strong version of the principle of plenitude.
According to the weak principle, if there is a small finite probability of unicorns existing then given enough time and space unicorns will exist. According to the strong principle, it follows from the theory of quantum mechanics that if it is possible for unicorns to exist then they do exist. One way in which this latter principle may be cashed out is in the ‘many-worlds’ interpretation of quantum mechanics, according to which reality has a branching structure in which every possible outcome is realized.

6.3 Simplicity and Induction

The problem of induction is closely linked to the issue of simplicity. One obvious link is between the curve-fitting problem and the inductive problem of predicting future outcomes from observed data. Less obviously, Schulte 1999 argues for a connection between induction and ontological parsimony. Schulte frames the problem of induction in information-theoretic terms: given a data-stream of observations of non-unicorns (for example), what general conclusion should be drawn? He argues for two constraints on potential rules. First, the rule should converge on the truth in the long run (so if no unicorns exist then it should yield this conclusion). Second, the rule should minimize the maximum number of changes of hypothesis, given different possible future observations. Schulte argues that the ‘Occam Rule’ (conjecture that Ω does not exist until it has been detected in an experiment) is optimal relative to these constraints. An alternative rule, for example conjecturing that Ω exists until 1 million negative results have been obtained, may result in two changes of hypothesis if, say, Ω's are not detected until the 2 millionth experiment. The Occam Rule leads to at most one change of hypothesis (when an Ω is first detected). (See also Kelly 2004.)

With respect to the justification question, arguments have been made in both directions. Scientists are often inclined to justify simplicity principles on broadly inductive grounds.
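Schulte's comparison of the Occam Rule with the alternative rule, described earlier in this section, can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions and not Schulte's own formalism: the names `occam_rule`, `delayed_rule` and `mind_changes` are hypothetical, each experiment is reduced to a boolean (True when an Ω is detected), and the numbers are scaled down from 1 million and 2 million to 10 and 20 so that the example runs instantly.

```python
def occam_rule(n_seen, detected):
    """Conjecture that the entity is absent until it has been detected."""
    return "exists" if detected else "absent"


def delayed_rule(patience):
    """Hypothetical rival rule: conjecture that the entity exists until
    `patience` experiments have passed without a detection."""
    def rule(n_seen, detected):
        if detected:
            return "exists"
        return "exists" if n_seen < patience else "absent"
    return rule


def mind_changes(rule, stream):
    """Count how many times the rule's conjecture changes along the stream."""
    detected = False
    conjecture = rule(0, detected)  # conjecture before any data arrives
    changes = 0
    for n_seen, outcome in enumerate(stream, start=1):
        detected = detected or outcome
        new_conjecture = rule(n_seen, detected)
        if new_conjecture != conjecture:
            changes += 1
            conjecture = new_conjecture
    return changes


# Scaled-down scenario: the first detection occurs at the 20th experiment.
late_detection = [False] * 19 + [True]
print(mind_changes(occam_rule, late_detection))
print(mind_changes(delayed_rule(10), late_detection))
```

On the late-detection stream the Occam Rule changes its conjecture once, while the delayed rule changes twice (first to "absent" after ten negative results, then back to "exists" on detection). Both rules settle on the same final answer, so the difference lies only in the number of retractions along the way.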
According to this inductive argument, scientists select new hypotheses based partly on criteria that have been generated inductively from previous cases of theory choice. Choosing the most parsimonious of the acceptable alternative hypotheses has tended to work in the past. Hence scientists continue to use this as a rule of thumb, and are justified in so doing on inductive grounds. One might try to bolster this point of view by considering a counterfactual world in which all the fundamental constituents of the universe exist in pairs. In such a ‘pairwise’ world, scientists might well prefer pairwise hypotheses in general to their more parsimonious rivals. This line of argument has two significant weaknesses. Firstly, one might legitimately wonder just how successful the choice of parsimonious hypotheses has been; counterexamples from chemistry spring to mind, such as oxygen molecules containing two atoms rather than one. Secondly, and more importantly, there remains the issue of explaining why the preference for parsimonious hypotheses in science has been as successful as it has been.

Making the justificatory argument in the reverse direction, from simplicity to induction, has a strong historical precedent in philosophical approaches to the problem of induction, from Hume onwards. Justifying the ‘straight rule’ of induction by appeal to some general Principle of Uniformity is an initially appealing response to the skeptical challenge. However, in the absence of a defense of the underlying Principle itself (and one which does not, on pain of circularity, depend inductively on past success), it is unclear how much progress this represents. There have also been attempts (see e.g. Steel 2009) to use simplicity considerations to respond to Nelson Goodman's ‘new riddle of induction